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f^ We examine the application of the Variational Monte Carlo (VMC) method to a cluster model for 

^^ , halo nuclei. Particular attention is paid to the error estimate in the presence of correlations in the 

f^ 1 underlying random walk. We analyse the required steps for a reliable application of the VMC in the 

* ~*^ ' case of a complicated many-body problem such as the direct solution of the nuclear hamiltonian with 

i -~H ' realistic interactions. We also examine the possibility of variance reduction through the 'zero variance 

I , principle', paying particular attention to the complexity of the many-body problem. 
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.. , 1. Introduction and Objectives 

One of the great difficulties in nuclear physics, is the numerical solution of the many-body 
Schrodinger equation with a complicated state-dependent two-body potential. Although 
j^ ■ the theoretical background is fairly simple, a numerical evaluation of the matrix elements 

is required, due to the complexity of the many-body integrals. The variational Monte Carlo 
(VMC) ^^^ is a well known method that can be used to numerically evaluate expectation 
values, particularly when the number of variables is large, such as in the many-body prob- 
lem. 

Our interest lies in the nuclear many -body problem and in particular in the area of light 
halo nuclei. Of particular importance is the presence of correlations which result from the 
nucleon-nucleon interaction. For this reason we require a direct solution of the Schrodinger 
equation. The final wavefunction is constructed by correlating a starting wavefunction that 
describes a simple non-interacting system, i.e., a product wavefunction or Slater determi- 
nant. The full wavefunction is then given by the action of a correlation operator on this 
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Starting function. The exact form of such a wavefunction is provided by the Coupled Clus- 
ter method (CCM) ^'^. In our approximation the correlation operator is composed of the 
product of linear and non-linear parts. The linear part (where the coefficients are deter- 
mined from the eigenvalue problem) describes the long-range behaviour of the system and 
can be obtained by retaining only the linear terms in the CCM expansion of the wave- 
function. The non-linear part is a Jastrow-type correlation operator ^'^ that includes the 
short-range correlations appropriate to the short-range behaviour of the nuclear force. This 
central operator is parametrised in terms of a number of parameters that appear non-linearly 
in the variational principle. 

One important issue in light nuclei is translational invariance, which can drastically ef- 
fect the expectation value if not properly taken into account. For this reason the correlation 
operators depend on the inter-particle distances only and are thus invariant under transla- 
tions. Since the physical picture of simple halo nuclei can be assumed to be that of an 
alpha-particle (or a number of alpha-particles) accompanied by a number of weakly-bound 
nucleons, we can use as a starting function the uncorrected product of the alpha-particle(s) 
wavefunction(s) with the valence nucleon(s) ones. The alpha-particle wavefunction ^ is 
obtained by correlating a starting function which is the product of harmonic oscillator 
wave functions. This can be separated into relative and centre-of-mass parts and can thus 
preserve translational invariance. The additional nucleons in the starting function are then 
assigned coordinates relative to the alpha-particle(s) centre(s)-of-mass and the formulation 
thus preserves translational invariance. A number of additional non-linear variational pa- 
rameters enter the starting function that are related to the average separation between the 
individual constituents. Apart from the right quantum numbers the antisymmetry condi- 
tion implementing the Pauli exclusion principle must be imposed. We make use of a scalar 
correlation operator that is invariant under any exchange of particle labels and thus all sym- 
metry requirements reside in the starting function. 

Therefore, a variational solution of the many-body problem can result in a large pa- 
rameter space, where the expectation value of the Hamiltonian involves a large number of 
complicated integrals. The VMC is an efficient method that can be used for the 'statistical' 
evaluation of such expectation values. More sophisticated methods that have been success- 
fully applied in the nuclear many-body problem are the Diffusion Monte Carlo (DMC) ^ 
and Green's function Monte Carlo (GMC) methods 9 10,11,12 xhese are rather compli- 
cated methods that can provide statistically exact estimates for the expectation value of a 
Hamiltonian provided a starting wavefunction (usually obtained with VMC) is provided. 
However, the complexity of the many-body problem restricts for the time being the appli- 
cation of such methods up to medium-mass nuclei and for simple configurations and is not 
always easy to apply to halo nuclei. Although these methods are accurate they can only 
provide the numerical value of the expectation value and the particular structure provided 
by the wavefunction is not obvious. On the other hand the VMC provides an approximation 
to the expectation value (according to the choice for the trial wavefunction) and can serve 
as a starting point for further investigation. This way the structure required for the best 
description of the physical system in question can be easily identified. 

The complexity of the many-body problem requires special attention rather than a naive 



Variational Monte Carlo for Microscopic Cluster Models 

application of the VMC method. This is mainly due to the presence of correlations in 
the random walk underlying the method. Apart for complicating the error estimate, such 
correlations can impose limitations on the applicability of the VMC method. In this paper 
we provide a detailed examination of the VMC method as applied to a realistic nuclear 
many-body problem. We investigate the presence of correlations and the technical issues 
involved in the evaluation of a reliable error estimate. We shall make use of a number 
of statistical concepts, most of which can easily be found in the literature such as ^. In 
addition we describe a possible variance reduction technique that potentially can be used 
to improve the performance of VMC. 

2. The Variational Problem 

The basic equation in the VMC is the time-independent Schrodinger equation, 

H{x)^{x) ^E^{x), (1) 

where in general we approximate $(x) as 

vI/(a;)=^C„/"(x). (2) 

n 

The wavefunction is expanded in terms of a set of normalizable trial functions linear in the 
coefficients {C„} and H is the hamiltonian. In general, x denotes the set of coordinates 
appropriate for the many-body hamiltonian. However, for simplicity spin-isospin degrees 
of freedom are ignored here. Multiplying equation (1) on the left by the complex conjugate 
wavefunction and integrating over the appropriate variable, the equation takes the form 



where <Kl is the volume element. The above equation can now be written as 



(3) 



where Tikn and Mkn represent the Hamiltonian and overlap matrix elements with 

-Hkn - j IlHfndn, (5) 



A4n = fkfndn. (6) 



The variational principle states that if Eq represents the exact lowest eigenvalue of the 
particular Hamiltonian then any estimate, E, for Eq obtained from (4) will be an upper 
bound of Eq- Using this fact a particular form for ^(x) may be optimised by varying the 
adjustable parameters to minimise the expectation value of (4). The best approximation to 
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the true lowest eigenvalue is obtained by the variation of the coefficients C„. This leads to 
a set of coupled equations of the form 

/ ^ Ti-knCn — E 2_^ J^knCn — 0, (7) 

n n 

that constitute a generalised eigenvalue problem. Once the matrix elements are known, 
solution of the eigenvalue problem is straightforward. 

2.1. Error estimate 

The matrix elements entering the eigenvalue problem will be evaluated numerically (as 
discussed in the next section), leading to an error in the estimated eigenvalue. In case where 
an error estimate for individual matrix elements exists, the total error for the eigenvalue 
problem of (7) can be obtain from the linear perturbation of the eigenvalue problem 

Y^i^kn + STiknKC'n + 6Cn) ^ {E + SE) J^i^kn + SMkn){Cn +6Cn), (8) 
n n 

where SE is the unknown error. Multiplying on the right by the same eigenvector and 
keeping only first order terms leads to 

SE = j {CkSHknCn - ECkSNknCn) , (9) 

with summation convention implied. If the Hamiltonian and overlap matrix errors were 
independent of each other the right hand side of the above equation could have been taken 
in quadrature giving a value for the maximum possible error, AE, that is the required error 
for the numerical calculation : 

AE^ j ^{CkAnknC'n)^ + {ECkAMknCa)^. (10) 

However, the above equation can lead to an incorrect error estimate since in reality the 
errors in the hamiltonian and overlap matrix elements are likely to be correlated (depending 
on the method used for their evaluation). A way of dealing with this problem is through 
the covariance matrix, which can be used to define a set of uncorrelated (independent) 
observables whose errors can be added in quadrature. The elements of the covariance 
matrix are co\{BzB^i ), defined as 

cow(B,B^,) = {B,B^j) - {B,}{B^,}, (11) 

with Bz € {{Hz}, {Nz}}, while the angular brackets denote an expectation value. This 
gives a real symmetric matrix of the form 

/ a^iHii) ... cov{HiiHnn) cov(HiiiVn) ... cov(iJiiiV„„) \ 



C = 



COw(HnnHii) ... (J^{Hnn) COv(i7„„iVii) ... COv(0„„A^„„) 

cov{NiiHn) ... cov(iViiiJ„„) a^Nn) ... cov(iVii7V„„) 

\ COv(Ar„„iJii) ... COv{NnnHnn) C0v(iV„„7Vii) ... Cr^(iV„„) / 

(12) 
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with dimensions (2n^) x (2n^), where nx n is the dimension of the hamiltonian and overlap 
matrices. The diagonal elements correspond to the variance of the hamiltonian and overlap 
matrix elements cr^ that is discussed later. Diagonalising the covariance matrix is equivalent 
to obtaining a new set of uncorrected observables that each is a linear combination of the 
old ones. This new set of uncorrected observables, {Qz}, can then be defined from the 
eigenvectors of C as 



Bz ~ y ^RzkQk, 

k 


(13) 


y\Czk^kz = A,R,„ 


(14) 


C = RAR^. 


(15) 


;tor of C, while A = diag(Ai, . 


., A2„2). The new (indepen- 



dent) observables satisfy the condition 

cow(QkQz) = 5kzK, (16) 

i.e. their covariance vanishes and their variance is equal to the eigenvalues of C. The error 
associated with the new observables is their standard deviation, which according to the 
above equation can be found as 

(Ag,)2 = ^, (17) 

with N denoting the number of samples taken for the observable. This way equation (10) 
can be written as 



AE 



(-^kJ^kn^-^n 



. E I E ^^(Rff..fe - ERm,k}C, I (AQfc)2, (18) 



where the errors corresponding to the independent set of observables, {Qk}, can be added 
in quadrature. 

3. Variational Monte Carlo and Error Estimate 

The generalised eigenvalue problem of equation (7) requires the computation of the Hamil- 
tonian and overlap matrix elements which in general are complicated integrals and one may 
wish to choose Monte Carlo sampling to perform the integration. The matrix elements of 
equations (5) and (6) can be written as 

A/fen = / ax w{x) , (20) 

J \ w{x) y 
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where w(x) plays the role of the probability density function provided that is definite pos- 
itive and shares the same symmetries as the full wave function. The eigenvalue problem 
leads to 

^C„ / dx w{x)Hkn{x) = E'^Cn / dxw{x)Nkn{x,) 

n •' n •' 

J2iHkn)C„ = Ej2{Nkn)C„, (21) 

n n 

which can be viewed as the expectation value of two distinct arbitrary functions of a random 
variable (Hkn and Nkn) taken from the same probability distribution. 

Because of the parametrization of the wavefunction in terms of the unknown coeffi- 
cients C„ a single probability distribution function appropriate for each individual matrix 
element cannot be extracted. Although in principle is possible to sample each matrix el- 
ement based on its own distribution leading to n^ distinct Monte Carlo algorithms (or 
"^"^ ' for a symmetric matrix), time requirements make such a calculation impractical. A 
way around this difficulty is to sample all of the matrix elements based on a single proba- 
bility distribution. Although this approach reduces the number of required simulations, it 
can lead to a large variance for individual matrix elements and special care is required. 

3.1. Statistical sampling 

In Monte Carlo methods, (Hkn) or (Nkn) are estimated using a large but finite set of 
values for x (a finite set of configurations {xi} for the multi-dimensional case) distributed 
according to w. In order to avoid referring to individual matrix elements we consider the 
expectation value of a general operator O which is denoted by (O). 

One well-known method of obtaining the required distribution is the Metropolis algo- 
rithm and this is the one we shall use. In order to generate values of the variable x having 
the required distribution the Metropolis algorithm makes use of a random walk: An initial 
point (set of coordinates), xq, is chosen and subsequent points are generated in steps by 
moving in a random direction, but each time within a prescribed radius from each individ- 
ual coordinate. Not all points, however, are kept but an 'accept or reject' method is used. 
This means that every point xi in the random walk belonging to the ^th step, is weighted 
against the previous point, xi^i, and is either chosen or rejected. The decision process is 
given by the Metropolis algorithm which generates a Markov chain. In order to fulfil the 
criteria for a Markov chain it is essential for each step that the decision process govern- 
ing the evolution of the random walk should not have any dependence on configurations 
belonging to previous steps. This means that the decision of keeping or rejecting point xi 
should only depend on the value of the point a;;_i and not on any previous points. An 
equivalent statement would be to say that the set of values, {0{xi)}, chosen to form the 
average should be uncorrelated with each other 

The random walk provides as with a statistical average that is different from the ex- 
act expectation value, but can be used as an approximation. The expectation value (O) 
corresponds to the average of the quantity O over an infinite ensemble of statistically in- 
dependent trials. This average can in principle be obtained exactly if integrals similar to 
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(19) and (20) can be evaluated analytically. The random walk that is actually performed 
in simulations provides an average over a finite sequence of measurements. This sample 
average or mean will be denoted by O. In the case of N samples taken from a distribution, 
the expectation value is approximated as 

(O) « O, 

1 ^ 

where the Xi represents the set of appropriate coordinates that are distributed according to 
a probability density w(x). 

3.2. Error estimate and correlations 

Provided we have a statistical average composed of N uncorrected samples taken from an 
arbitrary probability distribution, the central limit theorem states that in the limit of large 
A^, the above average follows a normal distribution with mean (O) and variance 



2 



1 



with 



^' = j;^< (23) 

<jI = (02) _ (0)2. (24) 

However, in a Monte Carlo sampling we are limited to a finite sample average O and 
we would like to know how to estimate the error of such an average which is the deviation 
from (O) (the limiting case provided by the central limit theorem). The theoretical value 
of this error can be obtained by considering an ensemble of independent random walks, 
where we can introduce the notion of the expected value of a sample average, (O), as well 
as that of the expected value of any single measurement (Oi = 0{xi)) corresponding to 
a particular step i of the random walks, denoted by (Oi). The demand that the individual 
measurements belong to a Markov chain and must in principle satisfy 

(O,) = (0,> = (0> = (0>, (25) 

meaning that all samples or measurements in the random walk are independent of their 
position relative to the others. The fact that there is a correlation between individual mea- 
surements corresponds to the case where 

{0^0j)^{0,){0,) (26) 

When the above is taken into consideration the variance of the mean becomes 
a'iO) = ((0-(0))2) 
= {{0-{0)r) 

= (o')-(o>^ 

= -^J2((0^0,)~m{0,)). (27) 
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This describes the deviation of the calculated mean from the theoretical expectation value. 

However, in practise it is impossible to have an algorithm which can generate sam- 
ples that are completely uncorrelated. This is indeed the case of the Metropolis algorithm 
which is used to generate the Markov chain. By the nature of the algorithm used the indi- 
vidual measurements are not statistically independent. Such correlations between statistical 
measurements have in general two effects. First, they reduce the number of independent 
measurements from the total number of performed measurements, hence the calculation 
converges more slowly. Second, the estimation for the statistical error will have to incorpo- 
rate the effect of such correlations otherwise any error estimate will be an underestimate. 
As for any numerical method a correct error estimate is of crucial importance for the va- 
lidity of the results. The rest of this section is devoted to the analysis of the error estimates 
for correlated data. Most of the results are taken from ^^ and ^^. 

In case that (26) does not hold the equation for the variance of O reduces to that of 
equation (23). However, when (26) holds the true variance for the mean can be written as 

13 



a\0) 



1 

iV 



CTO 



AT-l 

t=i 



t 

N 



Pt 



(28) 



where 



Pt 



{o,o,)-m{o,) t^\z~ji 



{O^O, 



i+tl 



m{o, 



i+tl 



(29) 



Furthermore, the presence of correlations also influences the value of the covariance be- 
tween the means of two different quantities, O and O , since 

cov(oo') = {dd)-{d){d) 
1 






= T^yMo^oA^mm 



1 



70- 



Af-l 

^E 

t=i 



N 



It 



where similarly to equation (29) we define 



(30) 



(31) 



An essential assumption for both pt and 74 is the invariance under 'time translations', mean- 
ing that only the separation between the measurements is of importance and not their place 
in the random walk (something also included in (25)), i.e. 



{0,0,) 

{o[o,) 



{0,0]) . 



(32) 



As we have seen earlier a reliable error estimate requires two quantities. Firstly, an es- 
timate of the error for each individual matrix element, something provided by the variance 
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of the mean for the particular matrix element. Secondly, an estimate for the covariance be- 
tween different matrix elements, so that a set of uncorrected observables can be obtained. 
In theory both of these objects are provided by pt and jt ■ However, in a practical simulation 
only approximate measurements can be made. 



3.3. Estimation of variance and covariance 

An estimate for pt and jt can be obtained through the auto- and cross-correlation coeffi- 
cients. The auto-correlation coefficient, Ct is defined in the case of N samples as 



CtiO) 



N - 



— ^(0,-0)(0,+,-0), 



(33) 



while the cross-correlation coefficient as 






(34) 



The variable t will be refereed to as the correlation time. These two coefficients provide 
biased estimators for pt and 7^, in the sense that they underestimate the actual values. This 
is expressed as 



(C,(0)) = pt-o\0) + ^u 
{Ct{0,0')) = jt-a\0) + A[, 



where the terms A^ and Aj are given as 



N-t N 



N-t N 



At = 2 cov(00 



N(N ~t) ^^ ^^ 



m^Ty 



(35) 
(36) 



(37) 



(38) 



with 

7., = (o.o,)-(o.)(o,), 

7; = (o.o;-)-(o.)(o;.). 

However, in most applications the largest correlation time in pt and jt is finite, meaning 
that equations (28) and (30) can be approximated by 



a'{0) . 


1 

N 


' 2 n^f. t\ ' 
*=1 ^ ^ . 


5 


(39) 


cov(0,o') R 


1 

N 


r T , 

cov(0,o') + 2) \2 1 


-^)^* 


(40) 
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The meaning of the above approximation for a random walk is that the correlation between 
different samples is of finite range in the sense that (OiOj) — {Oi){Oj) and {OiO'A — 
{Oi){0'j) become zero for large enough correlation time t = \i — j|. The parameter T in 
the above equations represents a cutoff parameter and is the maximum correlation time that 
will be taken into account. The significance of a finite correlation time is that the biases 
Ai and A^ in equations (35) and (36) will become arbitrarily small for sufficiently large 
number of samples n. 

Provided that -^ is sufficiently small, the variance and covariance can be approximated 
by 



a^(0) 



cov(0, 



N 
+ 2Ef=iCt 



TV 

The above equations provide as with a way of measuring the strength of correlations in a 
particular simulation through the 'normalised' correlation coefficients, Ci/ctq and Ct/Co- 
These can be obtained for a particular simulation as a function of the correlation time t. 

3.4. Application and results 

In order to examine the accuracy of the variance estimate provided by equation (41) a 
simple one dimensional problem was considered. A hamiltonian of the form 

was used, where the exact value of the lowest eigenvalue is known and is equal to — i. 
We can approximate this value through a variational problem in terms of the eigenvalue 
equation (7). A Gaussian non-orthogonal expansion of the form 

/„-e-^"-'. (44) 

was used to approximate the wave function with the {/i„} belonging to a geometric series. 
Since a numerically exact result can obtained for this approximation, we can also use Monte 
Carlo sampling in order to verify the error estimate. The probability density function was 
taken to be the square of the first Gaussian. Since we can obtain the numerically exact 
values for Hkn and Afkn we can use this to construct an unbiased estimator for the variance 
and thus to obtain an uncorrelated estimate for the variance of each matrix element. For 
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example the variance for the Hamiltonian matrix elements is given as 



N 



iCHkn - Ekn) ) 

n 
2^(^L - Ekn) 



(45) 



where Ekn corresponds to the exact value while the summation is over a number of distinct 
random walks with 'H\^ denoting the distinct average obtained in the ith walk consisting of 
N samples. The approximation symbol becomes an equality in the limit of large n. For our 
simple one dimensional model the convergence is relatively fast and the results obtained 
can be referred to as statistically exact. Simultaneously we can obtain the value of the 
variance through the biased estimator of (41) for any one of the performed averages. 



0.18- 



biased estimator 
numerically exact 




20 



Fig. 1 . The (biased) variance estimate for different matrix elements of tlie Hamiltonian matrix as a function of the 
correlation time t, for a simple one-dimensional model. The fact that the biased estimate approaches a constant 
value with respect to t indicates that there is a cutoff in the correlation coefficients (as shown previously). The 
dotted lines represent the value for the variance obtained through an unbiased measurement. 

The statistically exact value of the variance and that of the biased estimator depending 
on the correlation time are shown in figure 1, which is a demonstration for the accuracy of 
the variance estimate. Provided that the correlation time is large enough, the value for the 
variance obtained from equation (41) agrees with that of the statistically exact value of (45). 
Although the statistically exact value cannot be obtained directly for practical calculations 
(it requires the analytical results), the biased estimator (41) becomes statistically exact for 
large t, provided there is a cutoff in the correlation. 

When the maximum correlation time is relatively small an unbiased error estimate can 
be obtained without having to consider the correlation coefficient. This is done by rejecting 
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a number of the samples taken and only considering those which are adequately separated 
in simulation time and thus uncorrelated. Since the samples are consecutively obtained 
from each other their separation in simulation time is exactly equivalent to the correlation 
time. If we call the maximum correlation time T, then each sample is correlated with 
samples separated from it by at most T steps. Therefore, if from the total number of 
samples obtained only the ones separated by at least T steps (or random walk moves) are 
considered, this new set of samples will be uncorrelated. In the literature these intermediate 
moves are usually referred to as 'thermalisations' ^, where is expected that a relatively small 
number is adequate. However, for more complicated models, the maximum correlation 
time might be too large for either completely removing the correlations or estimating the 
effect of the correlation coefficient without any intermediate moves. 

In order to get an idea about the correlation coefficient for the many-body problem 
we can consider the linearised approximation of the many-body wavefunction in terms of 
correlations operators acting on a starting function i^'^^. This provides a translationally 
invariant description of the many-body problem and was applied to a number of closed- 
shell systems in terms of an alpha-cluster model ^^. We firstly consider the ground-state of 
the alpha-particle. The many body Hamiltonian is of the form 

H = J2^^'+ E ^fe)' (46) 

where V{ij) is a realistic nucleon-nucleon interaction that is composed of a sum of Gaus- 
sian functions : 

k 

with rij being the distance between the the ith and jth particles. The wavefunction ^I* is 
given as 

* == i^$o, (48) 

where $o represents the wavefunction for the uncorrelated four-particle system that we 
take to be the product of harmonic oscillator wavefunctions. F is a correlation operator 
that is approximated by a linear expansion of the form 

N / N \ 

f{ij) = cxp{dkrfj), g{ij) = fci exp(-Air,2j) + /ca exp(-A2r,2j). (50) 

This type of wavefunction (refered to as the J-TICI(2) scheme) has been extensively used 
for the alpha-particle is 19,20,21 ^\iqj^q [^ yy^g shown to provide an adequate description for 
the ground-state properties. The probability density function w was taken to be the square 
of one of the components in the expansion of the wavefunction i.e 

w = (Fo*o)'. (51) 
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Overlap matrix 




40 50 10 
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Fig. 2. The vaiiance (lower part) and the nomialised auto- and cross-coiTelation coefficients (upper part) as a 
function of coirelation time. These results are for the matrix elements of the hamiltonian and overlap matrices of 
the alpha-particle calculation. The variance graphs start from a value belonging to the unbiased estimator and 
increase as the estimator becomes biased by including the effects of the conelation coefficient. 
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This is a natural choice for the pdf since the wavefunction can always be written as the 
product of w and a function ^'. 

The variance of equation (41) and the normalised auto and cross-correlation coeffi- 
cients of (33) and (34) were sampled as functions of correlation-time. This was done for 
the matrix elements of both the hamiltonian and overlap matrices. The result is shown in 
figure 2. We can see in the lower part of the figure that the variance strongly depends on the 
correlation coefficient, starting from a minimum (unbiased estimator) and finally converg- 
ing. According to the previous analysis this indicates that despite the fact that the variance 
depends on the correlation time, there is a cutoff in the correlation coefficient, which im- 
plies that the dependence on the correlation coefficient will be over a restricted range of 
the correlation time (approximately about the cutoff). This is backed up by sampling the 
correlation coefficient (upper graphs of figure 2), where we can see that the value of the 
normalised correlation coefficient decays rapidly as the correlation time increases. Accord- 
ing to the figure we can safely assume 50 samples as the value of the cutoff, something that 
can be compensated by taking 50 intermediate moves in the random walk. 

The alpha-particle wavefunction is spherically symmetric and depends on 12 variables 
(4x3 cartesian coordinates). It is an easy numerical calculation both in terms of computa- 
tion time and complexity. We can go one step further by introducing one or two additional 
nucleons, corresponding to the cases of ^He and ^He. We shall do so by considering 
scalar interactions that do not mix the spatial and spin-isospin terms. The inclusion of 
additional nucleons leads to two complexities. On one hand the wavefunction must be an- 
tisymmetrized, something that does not imply a straight forward choice for the pdf w and 
on the other hand the nuclear potential is composed of several parts including spin, isospin 
and spin-isospin exchange terms. The wavefunction has the form 



= ^C^A{^^Xar}, (52) 



where the correlation operator has been approximated as before, while X(yT is the tensor 
product of the individual spin and isospin terms. A represents an antisymmetrization op- 
erator. The linear correlation operator is invariant under permutation of particle labels and 
can be taken outside the antisymmetrizer. Special attention is required for the antisym- 
metrization operator, otherwise the number of terms entering the wavefunction will be too 
many for any practical calculation. For simplicity we do not discuss the quantum numbers 
corresponding to angular momentum, total spin and total isospin, having in mind that the 
wavefunction must be an eigenstate of good quantum numbers. The function $o is not 
the uncorrected harmonic oscillator but is modified to include the motion of the additional 
nucleon(s) relative to the alpha-particle. In general the function $0 depends on a number 
of non-linear variational parameters that are related to the average separation between the 
individual clusters. 
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The norm of the wavefunction can be written as 

i 

= fc^c,Cj(Aff*»)(Aff*j), (53) 

where fc is a constant and Agff is an 'effective antisymmetrizer' that is only composed of a 
much smaller number of permutations than A. This provides a relatively simple pdf w, for 
example 

w=|Aff$o|- (54) 

This is not the natural choice as in the case of the alpha-particle (or any totally symmetric 
wavefunction) but is rather artificial. We made use of this type of pdf for going beyond 
the alpha-particle. Figure 3 illustrates the variance estimation for the ^He calculation. In 
the case of the hamiltonian matrix the correlation coefficient decays rather rapidly and 
we get a similar result to the alpha-particle i.e. the estimation of the variance converges 
within a relatively small correlation time, something that can be compensated by taking a 
number of intermediate moves. However, in the case of the overlap matrix we see that the 
correlation coefficient has a very slow decay so that the variance estimate does not converge 
with a reasonable correlation time that can be compensated with intermediate moves alone. 
Furthermore, if we go beyond the ^He into heavier systems such as ^Be we observe the 
same behaviour for the Hamiltonian matrix elements. 

Therefore, in order to obtain a converging estimate for the variance it might be more 
convenient to make use of both intermediate moves and the unbiased estimator of (41). For 
example by introducing a small number of intermediate moves we can force a cutoff to the 
correlation coefficient and thus obtain a converging variance through equation (41). This 
is illustrated in figure 4 for the correlation coefficient of the ^He overlap matrix: without 
any intermediate moves the correlation coefficient decays rather slowly, while after taking 
a number of intermediate moves a reasonably small cutoff is obtained. 

It must be noted that the more samples are discarded or the largest the correlation 
time is, the more time consuming the simulation becomes. Therefore, in general the most 
efficient approach that can guarantee a correct variance estimate is to make use both of 
intermediate moves and the correlation coefficient. 

4. Variance Reduction 

Although correlations can be taken into account or even removed from the sampling of 
a particular observable, this can only provide an algorithm with a reliable error estimate. 
In order to get an adequate error a large number of samples are required, the number de- 
pending on the particular observable. The so called 'zero-variance principle' is a way of 
increasing the efficiency of a Monte Carlo algorithm by reducing the variance. In gen- 
eral the principle is to replace each observable to be computed by an improved estimator 
which has the same mean but a different variance. The method is described in ^^ where 
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Fig. 3. The variance (lower part) and the nomialised auto- and cross-correlation coefficients (upper part) as a 
function of correlation time. These results are for the matrix elements of the hamiltonian and overlap matrices of 
the ^He calculation. The variance graphs start from a value belonging to the unbiased estimator and increase as 
the estimator becomes biased by including the effects of the correlation coefficient. 
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Fig. 4. The correlation coefficient before and after taking intermediate moves (indicated by the arrow). The 
presence of intermediate moves introduce a reasonably small cutoff. 
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applications of the zero-variance principle were shown to be very powerful. This variance 
reduction technique is examined in order to establish its usefulness for the case of many- 
body cluster models. 

4.1. Method 

We consider the expectation value of an observable Okn given as 

(Ofcn) = / w(x)Okn(x)dx, (55) 

where as usual w{x) represents a normalised probability density function (pdf). In our 
case (Okn) corresponds to a particular matrix element of either the Hamiltonian or overlap 
matrices, where for practical purposes all expectation values are over the same single pdf. 
This implies that we can optimise w{x) for at most one of the matrix elements. 
We introduce the 'renormalized' observable Okn where 



ht 



kn 



^kn — ^krt 

= Okn + Okn (56) 

with h and ipkn representing a trial operator and a trial function appropriate for the expec- 
tation of each individual matrix element. In order for the expectation value of Okn to be 
the same as that of the original observable the operator h must satisfy the condition 



hy/H^ = (57) 

This leads to the equation 

hlpkn = {(Okn) - Okn)\/w, (58) 

which has the consequence that the error of the calculation, <j{Okn), vanishes. Although 
in principle equation (58) should be satisfied, in practise an approximate solution can be 
searched that minimises the variance of the calculation. The variance for the expectation 
value of the new operator of equation (56) now becomes 

T'(Ofe„) - a'^iOkn) + (T^iOkn) + 2{{OknOkn) ~ (Ofe„)(Ofe„)). (59) 

A possible way to approximate the function ipkn would be to consider a finite expansion 
linear in some coefficients like in the case of the wavefunction ^ (equation (2)). This means 
that V'fcn can be given as 

V;fc„-^Af"5„ (60) 

i 

where the functions {g^} are taken to be common for all matrix elements, while the coef- 
ficients {A*^"} are determined by minimising the variance of a particular matrix element, 
in this case the observables {Okn)- The condition that the variance is minimised can be 
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imposed by demanding that upon variation of the coefficients A*'" the value of the variance 
reaches a minimum : 

5a^{dkn) = (61) 



dOkn 



= 0, VA™. (62) 



This leads for each matrix element to an equation of the form 

^4"Ay +/ff"=0, (63) 



where 



Therefore, for a particular set of functions {gi}, the coefficients { A*^"} can then be obtained 
as 

4"^-^(A)T/xf". (66) 

i 

One difficulty that might arise is when one or more of the eigenvalues of A is near zero 
(especially when it is zero within the numerical accuracy of the inversion procedure), in 
which case solving equations (63) by inverting A will lead to numerical instabilities. Di- 
agonalising A is equivalent to obtaining linear superpositions of the functions {gi} that 
are uncorrected with respect to the operator -^. However, this might not be possible 
for the entire set {gi}. This problem can be bypassed by solving (63) with singular value 
decomposition, which is equivalent to considering a reduced set of functions {gi}. 

By substituting the coefficients back to equation (59) the variance of each particular 
matrix element becomes 

ij 

= <T2(Ofe„)-^4fc«AyAf, (67) 

ij 

which is always bound to be less than cr^(Ofe„), whatever the choice for the g^'s due to the 
fact that the eigenvalues of A are always positive. 

An efficient method of obtaining an appropriate set of functions gi is to run a number 
of short Monte Carlo algorithms until a set that reduces the variance of the calculation ef- 
ficiently is found. The set of coefficients {Ai} can be obtained from a short sampling and 
used as input for a larger computation. 
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Note: The error for the zero variance calculation can also be obtained from a set of uncor- 
related errors giving an equation similar to (18). The only difference is in the covariance 
matrix to be calculated. The new matrix elements are 

Cij ^ cow {BiBj), 

= cow{[B, + B,][Bj+Bj]), 

= coy{B,B/) + Y, KK + E ^nK'i + Y. ^j,A^™A„™, (68) 

n n nm 

where i is a collective index representing a particular matrix element of either the Hamil- 
tonian or overlap matrix ({Onm, Nnm} = {Bi}), while A and K are defined in equations 
(64) and (65). 

4.2. Application and results 

Although the variance reduction technique can in principle reduce the variance of an ob- 
servable, for practical calculations we are confined to the linear approximation of equation 
(60). For a complicated many-body problem the choice of the functions {gi} entering (60) 
is not at all obvious. For our calculations we shall use the same set as the one used for 
the linear eigenvalue problem. This is the simplest available choice. We can examine the 
applicability of this approach through the same one dimensional problem as before before 
going to the many-body case. In this case the integrals of equation (67) can be solved in 
a numerically exact manner without having to make use of Monte Carlo sampling. The 
numerically exact results for the variance of the one-body problem are shown in figure (5). 
In this case the variance reduction greatly reduces the variance of the matrix elements and 
can lead to a zero variance (within the numerical accuracy of the machine used). 

We can then proceed and apply the same method to the case of the alpha-particle in the 
J-TICI(2) approximation, this time using Monte Carlo sampling. Figure 6 is the variance of 
a few of the hamiltonian matrix elements as functions of the number of linear components 
used in equation (60). Although there is a substantial reduction in the variance, the effect 
is not as strong as in the one-dimensional case (not a zero-variance principle any more) 

Although there is a substantial reduction in the variance of the alpha-particle calculation 
(about 90%), this is not a reduction that can be of practical help. Having in mind that the 
error is given by the standard deviation (that is the square root of the variance) we have that 
its value changes with the number of samples as -j=. A 90% reduction is (approximately) 
equivalent to an error dependence of the form —K^ . This is not much of an error reduction, 
particularly when compare with the simple one-dimensional case. For the 'zero variance' 
principle to be valuable we require a radical variance reduction. Therefore, the variance 
reduction for the alpha particle is not sufficient for aiding the numerical calculation. A 
similar situation was also observed in the case of the more complicated systems of ^He and 
^He. Although a variance reduction was possible it was not of a substantial contribution to 
the simulation. Furthermore, the amount of reduction was considerably lower than that for 
the alpha-particle. This is not unexpected since the wavefunction is more complicated and 
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Fig. 5. The variance of some matrix elements of the hamiltonian matrix for the one-dimensional problem as a 
result of applying the zero variance principle. The variance was plotted against the number of components used 
to approximate the trial function. 
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Fig. 6. The variance of some matrix elements of the hamiltonian matrix for the alpha-particle in the J-TICI(2) 
approximation as a result of applying the 'zero-variance' principle. The variance is plotted against the number of 
components used to approximate the trial function. 
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the issue of antisymmetrization is also present. 

It seems that the complicated structure of the many-body wavefunction requires a dif- 
ferent kind of approximation of (58) than the simple linear one, for any effective reduction 
of the variance. 

5. Summary and Conclusions 

The variational Monte Carlo method is an important ingredient for cluster-like models since 
it allow as to investigate different structures in the cluster model without having to worry 
about the analytical solution of many-body integrals. This requires the numerical method 
to be both accurate and precise in the error estimate. Furthermore, is a starting point for 
more sophisticated methods such as GMC. 

In our cluster model the VMC in is applied to the generalised eigenvalue problem, 
where the total error requires to decorrelate the individual matrix elements of the hamilto- 
nian and overlap matrices. This requires knowledge of the co variance matrix, where the 
diagonal elements coincide with the variances. This is a straight forward task that does not 
impose any difficulty for the many-body calculations. 

We have illustrated that the knowledge of the correlation coefficient is crucial for a cor- 
rect estimate of the variance (or covariance), particularly for complicated systems i.e. the 
simulation variance is different from the analytical one due to the presence of correlations. 
This requires to make use of a biased variance estimator, where the bias becomes arbitrarily 
small as the number of samples increases. For such an estimator to be practical we require 
that the correlation coefficient has a reasonable cutoff, i.e. that a particular sample in the 
random walk is only correlated up to a finite number of previous samples. In the case where 
the cutoff is relatively large it can be reduced by discarding a number of samples between 
the values taken. The most efficient approach that can guarantee a correct variance estimate 
is to make use both of intermediate moves and the correlation coefficient. 

We also examined a variance reduction technique that can be used to improve the ef- 
ficiency of a calculation, the so called 'zero variance principle'. This was taken from ^^. 
The principle is in general complicated to apply and we only considered a linear approxi- 
mation. This sufficiently decreased the variance of the alpha particle calculation. For more 
complicated systems a different kind of approach is required. In principle we could have 
looked for a more complicate approximation than the one at hand, but this is beyond our 
purpose. It seems that variance minimisation for the many-body problem is not a straight 
forward matter, particularly when antisymmetrization is involved. 

In general we have analysed the application of the VMC method for the nuclear many- 
body problem, and in particular for the case of the generalised eigenvalue problem. We 
discussed and analysed the required steps for a reliable error estimate. 
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